Skip to content

BUG: groupby transform fillna produces wrong result #31101

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

fujiaxiang
Copy link
Member

@fujiaxiang fujiaxiang commented Jan 17, 2020

@fujiaxiang fujiaxiang requested a review from WillAyd January 18, 2020 05:59
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah would be great to generalize these changes, this is a very narrow patch

@fujiaxiang fujiaxiang requested review from WillAyd and jreback January 22, 2020 03:28
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking better!

@fujiaxiang
Copy link
Member Author

Switched to use transformation_func defined in conftest.
Have noticed a few issues there.

  1. Why is corrwith a transformation function? Shouldn't it be a reduction?
  2. groupby transformation with pad, backfill and cumcount raises AttributeError: 'Series' object has no attribute 'xxx'. Have ignored these two cases in the test for now. This issue is already mentioned in BUG: Groupby.transform('cumcount') fails #27472.

@fujiaxiang fujiaxiang requested a review from WillAyd January 22, 2020 15:25
@fujiaxiang fujiaxiang closed this Jan 22, 2020
@fujiaxiang fujiaxiang reopened this Jan 22, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, ex @WillAyd commnents.

@jreback jreback added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Jan 23, 2020
fujiaxiang and others added 2 commits January 24, 2020 11:32
Use xfail for cases that are not applicable yet

Co-Authored-By: William Ayd <[email protected]>
@pep8speaks
Copy link

pep8speaks commented Jan 24, 2020

Hello @fujiaxiang! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-01-24 05:12:24 UTC

@jreback jreback added this to the 1.1 milestone Jan 24, 2020
@fujiaxiang fujiaxiang requested review from WillAyd and jreback January 24, 2020 05:13
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@WillAyd WillAyd merged commit aad377d into pandas-dev:master Jan 24, 2020
@WillAyd
Copy link
Member

WillAyd commented Jan 24, 2020

Thanks @fujiaxiang keep the PRs coming!

@fujiaxiang fujiaxiang deleted the groupby_transform_fillna_wrong_result branch May 29, 2020 10:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Groupby Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected behaviour of groupby.transform when using 'fillna'
5 participants